System and method for on-network storage services

ABSTRACT

A method for managing on-network data storage using a communication network. Requests for data are received within an intermediary server from a plurality of external client applications coupled to the network. Units of data are stored in one or more data storage devices accessible to the intermediary server. Each storage request is associated with a token representing the request. The token is sent to a storage management server coupled to the network and having an interface for communicating with the intermediary server. The storage management server returns specific location information corresponding to the request associated with the received token. The intermediary server accesses the data storage mechanism using the specific location information to retrieve data at the specific location. The retrieved data is delivered to the client application that generated the request.

RELATED APPLICATIONS

[0001] The present invention claims priority from U.S. Provisional Patent Application No. 60/197,490 entitled CONDUCTOR GATEWAY filed on Apr. 17, 2000.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention.

[0003] The present invention relates, in general, to network information access and, more particularly, to software, systems and methods for providing database services in a coordinated fashion from multiple cooperating database servers.

[0004] 2. Relevant Background.

[0005] Increasingly, business data processing systems, entertainment systems, and personal communications systems are implemented by computers across networks that are interconnected by internetworks (e.g., the Internet). The Internet is rapidly emerging as the preferred system for distributing and exchanging data. Data exchanges support applications including electronic commerce, broadcast and multicast messaging, videoconferencing, gaming, and the like. In electronic commerce (e-commerce) applications, it is important to provide a satisfying buying experience that leads to a purchase transaction. To provide this high level of service, a web site operator must ensure that data is delivered to the customer in the most timely, usable and efficient fashion.

[0006] The Internet is a collection of disparate computers and networks coupled together by a web of interconnections using standardized communications protocols. While most Internet access is currently performed using conventional personal computers and workstations, the variety of devices that access the Internet is growing quickly and expected to continue to grow. It is expected that a variety of appliances and devices within offices, businesses, and households will support Internet connectivity in the coming years. A major segment of growth is in the area of lightweight computing appliances. Examples include wireless telephones, personal digital assistants (PDAs), digital picture frames, digital music, and digital movies among other examples. These devices are characterized by little or no mass storage capability. In such devices there is increased need to access external mass storage such as network storage devices to access information needed to perform their functions.

[0007] The Internet is characterized by its vast reach as a result of its wide and increasing availability and easy access protocols. Unfortunately, the ubiquitous nature of the Internet results in variable bandwidth and quality of service between points. The latency and reliability of data transport is largely determined by the total amount of traffic on the Internet and so varies wildly seasonally and throughout the day. Other factors that affect quality of service include equipment outages and line degradation that force packets to be rerouted, damaged and/or dropped. Also, routing software and hardware limitations within the Internet infrastructure may create bandwidth bottlenecks even when the mechanisms are operating within specifications. The variable nature of the quality of service (QOS) provided by the Internet has made development and deployment of database systems that leverage the Internet infrastructure difficult.

[0008] With the advent of the Internet, computing appliances that can potentially act as interfaces to a database have potentially ubiquitous access to this stored database information. The Internet promises to enable ready access from a wide variety of computing appliances at a wide variety of locations. Typically, when data is stored on a network it is stored at a location associated with a network service that administers that data. For example, MP3 music files may be stored in a centralized database that stores only MP3 files. Digital movies or presentation materials are stored on specific servers that administer requests for those materials. This enables the administering server to regulate, control, and charge for access to the data.

[0009] Managing access to data files often involves a disparity between the resources required to perform the administrative and management functions and the resources required to serve the data efficiently. Management functions such as receiving requests, locating files, recording metadata describing who, when and where the files were accessed, account management and billing tend to involve relatively small volumes of data that are efficiently handled by a processor with fast access to an administrative database. In contrast, the actual data file delivery involves transactions with larger data units and is beneficially performed by a processor with a low latency connection to the end-user that is receiving the data.

[0010] However, the conventional close-coupling between the services that manage the data and the data store itself restricts the accessibility of the data. This results in data stores being located behind a database management engine at a location that is not optimal for delivery of data to end users and increases the cost of transporting the data. Alternatively, management functions can be replicated across multiple servers requiring coordination, synchronization and added complexity. A need exists for on-network data storage systems and methods that efficiently perform the disparate tasks associated with data storage and management.

[0011] Beyond varying functional requirements for data storage and access, there are increasing political, security, legislative and availability criteria that influence where certain data is physically stored or across what borders it is transported. For example, politically sensitive data may not be permitted in some jurisdictions. In another example, a law firm may wish that all client data be physically stored on servers within its control. Until now, such data storage solutions could not be managed by external services. For example, if the data owner wished to make data available for a per-access charge, the owner would be forced to implement the charging mechanisms on its own servers, or compromise the desired data storage criteria by replicating the data onto the servers of an external service provider. Hence, a need exists for systems and methods that enable an external service provider to provide data management and access services to data that is physically stored on data-owner controlled storage mechanisms.

SUMMARY OF THE INVENTION

[0012] Briefly stated, the present invention involves a method and system for managing on-network data storage using a communication network. Requests for data are received within an intermediary server from a plurality of external client applications coupled to the network. Units of data are stored in one or more data storage devices accessible to the intermediary server. Each storage request is associated with a token representing the request. The token is sent to a storage management server coupled to the network and having an interface for communicating with the intermediary server. The storage management server returns specific location information corresponding to the request associated with the received token. The intermediary server accesses the data storage mechanism using the specific location information to retrieve data at the specific location. The retrieved data is delivered to the client application that generated the request.
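
The request flow summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the claimed implementation: the class and method names, the use of a random hex string as the token, the access log, and the in-memory dictionaries standing in for the location table and the data store are all hypothetical, since the summary does not specify token contents or interfaces.

```python
import secrets


class StorageManagementServer:
    """Hypothetical stand-in for the storage management server."""

    def __init__(self, locations):
        self._locations = locations  # resource name -> specific location
        self.access_log = []         # (token, resource) pairs, e.g. for metering

    def locate(self, token, resource):
        # Record the token for the request and return the specific location.
        self.access_log.append((token, resource))
        return self._locations[resource]


class IntermediaryServer:
    """Hypothetical stand-in for the intermediary server."""

    def __init__(self, manager, data_store):
        self._manager = manager
        self._data_store = data_store  # specific location -> stored bytes

    def handle_request(self, resource):
        token = secrets.token_hex(8)                      # token representing the request
        location = self._manager.locate(token, resource)  # ask the management server
        return self._data_store[location]                 # retrieve data at that location
```

In this sketch the management server never touches the stored bytes; it only answers location queries, which mirrors the separation of storage management from physical storage.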

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] FIG. 1 illustrates a general distributed computing environment in which the present invention is implemented;

[0014] FIG. 2 shows in block-diagram form significant components of a system in accordance with the present invention;

[0015] FIG. 3 shows a network architecture of components implementing the on-network data storage system in accordance with the present invention;

[0016] FIG. 4 shows front-end components of FIG. 2 in greater detail;

[0017] FIG. 5 illustrates entity relationships and data exchanges in a first type of storage access in accordance with the present invention; and

[0018] FIG. 6 illustrates entity relationships and data exchanges in a second type of storage access in accordance with the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0019] The present invention is illustrated and described in terms of a distributed computing environment such as an enterprise computing system using public communication channels such as the Internet. However, an important feature of the present invention is that it is readily scaled upwardly and downwardly to meet the needs of a particular application. Accordingly, unless specified to the contrary, the present invention is applicable to significantly larger, more complex network environments, including wireless network environments, as well as small network environments such as conventional LAN systems.

[0020] In accordance with the present invention, some or all of the data storage normally implemented at the web site implementing the data management processes is instead implemented in a front-end server that enjoys a lower latency connection to an end user. The administration services are handled centrally while the data storage is handled more locally to a client.

[0021] In another respect, the present invention enables separation between the tasks involved in physical storage and the tasks involved in managing access to the physical storage. From a security perspective, this enables data to be placed in a physical storage location that meets criteria defined by the data owner such as physical location, topological location, legal jurisdictions, and the like. Physical data storage may actually be implemented on the data owner's machine, or on one or more network storage device(s) that provide suitable access control or security provisions appropriate for the data. At the same time, data administration services can be implemented in a network service node independent of the physical storage. In this manner, a user's data requests can be metered and managed by the network service node while the data owner does not need to compromise the desired storage criteria.

[0022] Specific implementations of the present invention involve the use of “web server” software and hardware to implement intermediary servers. For purposes of this document, a web server is a computer running server software coupled to the World Wide Web (i.e., “the web”) that delivers or serves web pages. The web server has a unique IP address and accepts connections in order to service requests by sending back responses. A web server differs from a proxy server or a gateway server in that a web server has resident a set of resources (i.e., software programs, data storage capacity, and/or hardware) that enable it to serve web pages using the resident resources whereas a proxy or gateway is an intermediary program that makes requests on behalf of a client to resources that reside elsewhere. A web server in accordance with the present invention may reference external resources of the same or different type as the services requested by a user, and reformat and augment what is provided by the external resources in its response to the user. Commercially available web server software includes Microsoft Internet Information Server (IIS), Netscape Netsite, and Apache, among others. Alternatively, a web site may be implemented with custom or semi-custom software that supports HTTP traffic.

[0023] FIG. 1 shows an exemplary computing environment 100 in which the present invention may be implemented. Environment 100 includes a plurality of local networks such as Ethernet network 102, FDDI network 103 and Token Ring network 104. Essentially, a number of computing devices and groups of devices are interconnected through a network 101. For example, local networks 102, 103 and 104 are each coupled to network 101 through routers 109. LANs 102, 103 and 104 may be implemented using any available topology and may implement one or more server technologies including, for example, UNIX, Novell, or Windows NT networks, including both client-server and peer-to-peer type networks. Each network will include distributed storage implemented in each device and typically includes some mass storage device coupled to or managed by a server computer. Network 101 comprises, for example, a public network such as the Internet or another network mechanism such as a fibre channel fabric or conventional WAN technologies.

[0024] Local networks 102, 103 and 104 include one or more network appliances 107. One or more network appliances 107 may be configured as an application and/or file server. Each local network 102, 103 and 104 may include a number of shared devices (not shown) such as printers, file servers, mass storage and the like. Similarly, devices 111 may be shared through network 101 to provide application and file services, directory services, printing, storage, and the like. Routers 109 provide a physical connection between the various devices through network 101. Routers 109 may implement desired access and security protocols to manage access through network 101.

[0025] Network appliances 107 may also couple to network 101 through public switched telephone network 108 using copper or wireless connection technology. In a typical environment, an Internet service provider 106 supports a connection to network 101 as well as PSTN 108 connections to network appliances 107. The present invention may be particularly useful in wireless applications because many wireless appliances 107 have limited local data storage capability, which makes obtaining external data more frequent and important. The present invention enables the data to be stored nearer to the wireless appliance, for example in ISP 106, but managed by any network-connected server 111 or appliance 107.

[0026] Network appliances 107 may be implemented as any kind of network appliance having sufficient computational function to execute software needed to establish and use a connection to network 101. Network appliances 107 may comprise workstation and personal computer hardware executing commercial operating systems such as Unix variants, Microsoft Windows, Macintosh OS, and the like. At the same time, some appliances 107 comprise portable or handheld devices using wireless connections through a wireless access provider such as personal digital assistants and cell phones executing operating system software such as PalmOS, WindowsCE, EPOCOS and the like. Moreover, the present invention is readily extended to network devices such as office equipment, vehicles, and personal communicators that make occasional connection through network 101.

[0027] Each of the devices shown in FIG. 1 may include memory, mass storage, and a degree of data processing capability sufficient to manage their connection to network 101. The computer program devices in accordance with the present invention are implemented in the memory of the various devices shown in FIG. 1 and enabled by the data processing capability of the devices shown in FIG. 1. In addition to local memory and storage associated with each device, it is often desirable to provide one or more locations of shared storage such as a disk farm (not shown) that provides mass storage capacity beyond what an individual device can efficiently use and manage. Selected components of the present invention may be stored in or implemented in shared mass storage.

[0028] In one embodiment, the present invention operates in a manner akin to a private network 200 implemented within the Internet infrastructure. This private network 200 is used to transport data between clients 205 and data servers 210, and/or to transport management and access control information between storage management servers 212 and the data servers 210 and clients 205. In essence, the private network 200 enables the split of physical storage (implemented by storage server 210 and data store 211) and storage management and access control (implemented by storage management server 212) contemplated by the present invention.

[0029] Private network 200 expedites and prioritizes communications between a client 205 and a data server 210. In the exemplary implementations, two intermediary computers, front-end 201 and back-end 203, are used cooperatively as intermediary servers to process database access requests and provide data services. However, it is contemplated that a single intermediary computer (i.e., either front-end 201 or back-end 203) may be used and still provide improved access to a data server 210. Further, it is also contemplated that front-end 201 and data server 210 reside in the same physical location.

[0030] In the specific examples herein client 205 comprises a network-enabled graphical user interface such as a web browser. However, the present invention is readily extended to client software other than conventional web browser software. Any client application that can access a standard or proprietary user level protocol for network access is a suitable equivalent. Examples include client applications that act as front ends for file transfer protocol (FTP) services, extensible markup language (XML) services, Voice over Internet protocol (VOIP) services, network news protocol (NNTP) services, multi-purpose internet mail extensions (MIME) services, post office protocol (POP) services, simple mail transfer protocol (SMTP) services, as well as Telnet services. In addition to network protocols, the client application may serve as a front-end for a network application such as a database management system (DBMS) in which case the client application generates query language (e.g., structured query language or “SQL”) messages. In wireless appliances, a client application functions as a front-end to a wireless application protocol (WAP) service.

[0031] Data server 210 implements connectivity to network devices such as back-end 203 to receive and process requests for data from data store 211. Data server 210 can be implemented as a database including relational, flat, and object oriented databases. Alternatively, data server 210 may comprise a virtual database that accesses one or more other databases. Further, data server 210 may be a data storage device or network file system that responds to requests by fetching data.

[0032] Front-end mechanism 201 serves as an access point for client-side communications. In one example, front-end 201 comprises a computer that sits “close” to clients 205. By “close”, “topologically close” and “logically close” it is meant that the average latency associated with a connection between a client 205 and a front-end 201 is less than the average latency associated with a connection between a client 205 and a server 210. Desirably, front-end computers have as fast a connection as possible to the clients 205. For example, the fastest available connection may be implemented in a point of presence (POP) of an Internet service provider (ISP) 106 used by a particular client 205. However, the placement of the front-ends 201 can limit the number of browsers that can use them. Because of this, in some applications it is more practical to place one front-end computer in such a way that several POPs can connect to it. Greater distance between front-end 201 and clients 205 may be desirable in some applications as this distance will allow for selection amongst a greater number of front-ends 201 and thereby provide significantly different routes to a particular back-end 203. This may offer benefits when particular routes and/or front-ends become congested or otherwise unavailable.
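
The latency-based definition of "close" given above lends itself to a short worked example. The sketch below is an assumption-laden illustration: the function names and the millisecond samples are hypothetical, and the text does not prescribe how latency is measured or how a front-end is chosen.

```python
from statistics import mean


def is_close(client_to_front_end_ms, client_to_server_ms):
    """True when the front-end is 'close' in the sense used above:
    its average latency to the client is below the server's."""
    return mean(client_to_front_end_ms) < mean(client_to_server_ms)


def closest_front_end(latency_samples):
    """Pick the front-end name with the lowest average measured latency.

    latency_samples maps a front-end name to a list of latency samples (ms).
    """
    return min(latency_samples, key=lambda name: mean(latency_samples[name]))
```

A selection rule of this kind could be combined with other criteria, such as congestion on a given route, which the paragraph above notes may favor a more distant front-end.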

[0033] Transport mechanism 202 is implemented by cooperative actions of the front-end 201 and back-end 203. Back-end 203 processes and directs data communication to and from data server 210. Transport mechanism 202 communicates data packets using a proprietary protocol over the public Internet infrastructure in the particular example. Hence, the present invention does not require heavy infrastructure investments and automatically benefits from improvements implemented in the general-purpose network 101. Unlike the general-purpose Internet, front-end 201 and back-end 203 are programmably assigned to serve accesses to a particular data server 210 at any given time.

[0034] It is contemplated that any number of front-end and back-end mechanisms may be implemented cooperatively to support the desired level of service required by the data server owner. The present invention implements a many-to-many mapping of front-ends 201 to back-ends 203. Because the front-end to back-end mappings can be dynamically changed, a fixed hardware infrastructure can be logically reconfigured to map more or fewer front-ends to more or fewer back-ends and web sites or servers as needed.

[0035] In one embodiment, front-end 201 and back-end 203 are closely coupled to the Internet backbone. This means they have high bandwidth connections, can expect fewer hops, and have more predictable packet transit time than could be expected from a general-purpose connection. Although it is preferable to have low latency connections between front-ends 201 and back-ends 203, a particular strength of the present invention is its ability to deal with latency by enabling efficient transport and traffic prioritization. Hence, in other embodiments front-end 201 and/or back-end 203 may be located farther from the Internet backbone and closer to clients 205 and/or data servers 210. Such an implementation reduces the number of hops required to reach a front-end 201 while increasing the number of hops within the TMP link 202 thereby yielding control over more of the transport path to the management mechanisms of the present invention.

[0036] Clients 205 no longer conduct all data transactions directly with the data server 210. Instead, clients 205 conduct some and preferably a majority of transactions with front-ends 201, which access the functions of data server 210. Client data is then sent, using TMP link 202, to the back-end 203 and then to the server 210. Running multiple clients 205 over one large connection provides several advantages:

[0037] Since all client data is mixed, each client can be assigned a priority. Higher priority clients, or clients requesting higher priority data, can be given preferential access to network resources so they receive access to the channel sooner while ensuring low-priority clients receive sufficient service to meet their needs.

[0038] The large connection between a front-end 201 and back-end 203 can be permanently maintained, shortening the many TCP/IP connection sequences normally required for many clients connecting and disconnecting.
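
The prioritization advantage described above can be illustrated with a small scheduler for a shared link. This is a sketch under stated assumptions rather than the scheduling used by the invention: the class name, the numeric priority scale (lower number means higher priority), and the `fairness_interval` rule that periodically serves the oldest waiting item are all hypothetical choices made to show one way of keeping low-priority clients from being starved.

```python
import heapq
from itertools import count


class SharedLinkScheduler:
    """Hypothetical scheduler for client data multiplexed over one connection."""

    def __init__(self, fairness_interval=4):
        self._heap = []            # entries: (priority, arrival order, payload)
        self._arrival = count()    # monotonically increasing arrival counter
        self._sent = 0
        self._fairness_interval = fairness_interval

    def enqueue(self, priority, payload):
        heapq.heappush(self._heap, (priority, next(self._arrival), payload))

    def dequeue(self):
        self._sent += 1
        if self._sent % self._fairness_interval == 0:
            # Every Nth send, serve the oldest waiting item regardless of
            # priority so that low-priority clients still make progress.
            oldest = min(self._heap, key=lambda item: item[1])
            self._heap.remove(oldest)
            heapq.heapify(self._heap)
            return oldest[2]
        # Otherwise serve the highest-priority item (ties broken by arrival).
        return heapq.heappop(self._heap)[2]
```

With `fairness_interval=3`, a low-priority item queued ahead of three high-priority items is sent third rather than last.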

[0039] A particular advantage of the architecture shown in FIG. 2 is that it is readily scaled. In accordance with the present invention, not only can the data itself be distributed, but the data service functionality and behavior is readily and dynamically ported to any of a number of intermediary computers in contrast to conventional database systems where the database functionality is confined to a particular server or limited set of servers. In this manner, any number of client machines 205 may be supported. In a similar manner, a database owner may choose to use multiple data servers 210 that are co-located or distributed throughout network 101. To avoid congestion, additional front-ends 201 may be implemented or assigned to particular data servers. Each front-end 201 is dynamically reconfigurable by updating address parameters to serve particular data servers. Client traffic is dynamically directed to available front-ends 201 to provide load balancing. Hence, when quality of service drops because of a large number of client accesses to a particular data server, an additional front-end 201 can be assigned to the data server and subsequent client requests directed to the newly assigned front-end 201 to distribute traffic across a broader base.

[0040] In the particular examples, this is implemented by a front-end manager component 207 that communicates with multiple front-ends 201 to provide administrative and configuration information to front-ends 201. Each front-end 201 includes data structures for storing the configuration information, including information identifying the IP addresses of data servers 210 to which they are currently assigned. Other administrative and configuration information stored in front-end 201 may include information for prioritizing data from and to particular clients, quality of service information, and the like.

[0041] Similarly, additional back-ends 203 can be assigned to a data server to handle increased traffic. Back-end manager component 209 couples to one or more back-ends 203 to provide centralized administration and configuration service. Back-ends 203 include data structures to hold current configuration state, quality of service information and the like. In the particular examples front-end manager 207 and back-end manager 209 serve multiple data servers 210 and so are able to manipulate the number of front-ends and back-ends assigned to each data server 210 by updating this configuration information. When the congestion for the data server subsides, the front-end 201 and back-end 203 can be reassigned to other, busier data servers. These and similar modifications are equivalent to the specific examples illustrated herein.
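
The dynamic assignment of front-ends to data servers described above amounts to maintaining a mapping that a manager component updates. The following sketch assumes a hypothetical `FrontEndManager` class and uses documentation-range IP addresses; the actual configuration data structures are not specified in the text.

```python
class FrontEndManager:
    """Hypothetical manager holding front-end to data-server assignments."""

    def __init__(self):
        # front-end name -> set of data server IP addresses it currently serves
        self._assigned = {}

    def assign(self, front_end, server_ip):
        """Assign a front-end to a data server (e.g. when the server is congested)."""
        self._assigned.setdefault(front_end, set()).add(server_ip)

    def release(self, front_end, server_ip):
        """Remove an assignment (e.g. when congestion subsides)."""
        self._assigned.get(front_end, set()).discard(server_ip)

    def front_ends_for(self, server_ip):
        return sorted(fe for fe, ips in self._assigned.items() if server_ip in ips)

    def servers_for(self, front_end):
        return sorted(self._assigned.get(front_end, set()))
```

Because each front-end maps to a set of servers and each server may appear under several front-ends, the structure naturally supports the many-to-many mappings discussed earlier. A back-end manager could keep an analogous table.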

[0042] In the case of web-based environments, front-end 201 is implemented using custom or off-the-shelf web server software. Front-end 201 is readily extended to support other, non-web-based protocols, however, and may support multiple protocols for varieties of client traffic. Front-end 201 processes the data traffic it receives, regardless of the protocol of that traffic, to a form suitable for transport by TMP 202 to a back-end 203. Hence, most of the functionality implemented by front-end 201 is independent of the protocol or format of the data received from a client 205. Hence, although the discussion of the exemplary embodiments herein relates primarily to front-end 201 implemented as a web server, it should be noted that, unless specified to the contrary, web-based traffic management and protocols are merely examples and not a limitation of the present invention.

[0043] As shown in FIG. 2, in accordance with the present invention data access services are implemented using an originating data server 210 operating cooperatively with the server of front-end 201 and a storage management server 212. Some or all of the functions of storage management server 212 may be implemented in a front-end 201, although such implementation diverges from the centralized storage management provided by storage management server 212.

[0044] Front-ends 201 alone or in cooperation with one or more back-ends 203 function as intermediary servers 206 as shown in FIG. 3. The abstraction of FIG. 3 simplifies the complexity of the implementation of private network 200. This abstraction is useful because the intermediary servers 206 integrate the otherwise separated physical storage and storage management components such that a client 205 can make data requests and receive data responses without requiring knowledge of the private network 200 shown in FIG. 2. In the preferred implementations, the physical data store 211 is coupled to an intermediary server 206 by a low latency connection. This low latency connection may be a connection through private network 200 as shown in FIG. 2, as well as LAN, WAN, MAN or SAN connections. It is also contemplated that physical data store 211 may be directly connected to intermediary server 206. Regardless of the location of physical data store 211, intermediary server 206 references storage management server 212 in order to locate particular files needed to respond to requests.

[0045] In order for a client 205 to obtain service from an intermediary server 206, it must first be directed to an intermediary server 206 (e.g., a front-end server 201) that can provide the desired service. Preferably, client 205 does not need to be aware of the location of intermediary server 206, and initiates all transactions as if it were contacting the storage server 210. In a particular implementation, a domain name server (DNS) redirection mechanism is used to connect a client 205 to a particular intermediary server 206. The DNS system is defined in a variety of Internet Engineering Task Force (IETF) documents such as RFC 0883, RFC 1034 and RFC 1035 which are incorporated by reference herein. In this implementation, at least one DNS server 307 is owned and controlled by system components of the present invention. When a user accesses a network resource (e.g., makes a data request), client 205 contacts the public DNS system to resolve the requested domain name into its related IP address in a conventional manner. In a first embodiment, the public DNS performs a conventional DNS resolution directing the browser to an originating server 210 and server 210 performs a redirection of the browser to the system-owned DNS server (i.e., DNS_C in FIG. 3). In a second embodiment, domain:address mappings within the DNS system are modified such that resolution of the originating server's domain automatically returns the address of the system-owned DNS server (DNS_C). Once a browser is redirected to the system-owned DNS server, it begins a process of further redirecting the browser 301 to a selected intermediary server 206. The intermediary server 206 may be selected based on contents of its local storage, or other criteria.
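
The final selection step, in which the system-owned DNS server picks an intermediary server, can be sketched as follows. The function name, the record layout, and the use of a numeric load figure as the "other criteria" are assumptions made for illustration; the addresses are from the documentation range and do not refer to real hosts.

```python
def select_intermediary(requested_name, servers):
    """Return the address of an intermediary server for a request.

    servers is a list of dicts with hypothetical keys:
      'address' - IP address to hand back to the client
      'stored'  - set of resource names held in the server's local storage
      'load'    - current load figure (lower is better)
    """
    # Prefer servers whose local storage already holds the requested data.
    holding = [s for s in servers if requested_name in s["stored"]]
    candidates = holding or servers
    # Among the candidates, fall back on load as the tie-breaking criterion.
    return min(candidates, key=lambda s: s["load"])["address"]
```

When no server holds the requested data locally, the sketch simply returns the least-loaded server, which would then obtain the data through the storage management and back-end path.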

[0046] Primary functions of the intermediary server 206 include responding to data requests from clients 205 by identifying the location of the stored data, accessing the stored data (i.e., performing read and/or write operations), and communicating results of the data access to the requesting client 205. Optionally, intermediary server 206 may prioritize amongst multiple queries and resolve the queries in an order based upon the prioritization. It is contemplated that the various functions described in reference to the specific examples may be implemented using a variety of data structures and programs operating at any location in a distributed network. For example, a front-end 201 or intermediary server 206 may be operated on a network appliance 107 or server within a particular network 102, 103, or 104 shown in FIG. 1.

[0047] FIG. 4 specifically illustrates components of a front-end 201; however, it should be understood that the components are largely similar to an intermediary server 206 with the variations noted herein. Back-end 203 provides complementary services and functions to front-end 201, and is not illustrated separately herein.

[0048] TCP component 401 includes devices for implementing physical connection layer and Internet protocol (IP) layer functionality. Current IP standards are described in IETF documents RFC0791, RFC0950, RFC0919, RFC0922, RFC0792, RFC1112 that are incorporated by reference herein. For ease of description and understanding, these mechanisms are not described in great detail herein. Where protocols other than TCP/IP are used to couple to a client 205, TCP component 401 is replaced or augmented with an appropriate network protocol process.

[0049] TCP component 401 communicates TCP packets with one or more clients 205. Received packets are coupled to parser 402 where the Internet protocol (or equivalent) information is extracted. TCP is described in IETF RFC0793 which is incorporated herein by reference. Each TCP packet includes header information that indicates addressing and control variables, and a payload portion that holds the user-level data being transported by the TCP packet. The user-level data in the payload portion typically comprises a user-level network protocol datagram.

[0050] Parser 402 analyzes the payload portion of the TCP packet. In the examples herein, HTTP is employed as the user-level protocol because of its widespread use and the advantage that currently available browser software is able to readily use the HTTP protocol. In this case, parser 402 comprises an HTTP parser. More generally, parser 402 can be implemented as any parser-type logic implemented in hardware or software for interpreting the contents of the payload portion. Parser 402 may implement file transfer protocol (FTP), mail protocols such as simple mail transfer protocol (SMTP), and the like. Any user-level protocol, including proprietary protocols, may be implemented within the present invention using appropriate modification of parser 402.

[0051] To improve performance, front-end 201 optionally includes a caching mechanism 403. Cache 403 may be implemented as a passive cache that stores frequently and/or recently accessed database content or as an active cache that stores database content that is anticipated to be accessed. Upon receipt of a TCP packet, HTTP parser 402 determines if the packet is making a request for data within cache 403. If the request can be satisfied from cache 403, the data is supplied directly without reference to data server 210 (i.e., a cache hit). Cache 403 implements any of a range of management functions for maintaining fresh content. For example, cache 403 may invalidate portions of the cached content after an expiration period specified with the cached data or by data server 210. Also, cache 403 may proactively update the cache contents even before a request is received for particularly important or frequently used data from data server 210. Cache 403 evicts information using any desired algorithm such as least recently used, least frequently used, first in/first out, or random eviction. When the requested data is not within cache 403, a request is processed to data server 210, and the returned data may be stored in cache 403.
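
By way of illustration only, the passive-cache behavior described above can be sketched as follows. The sketch assumes least-recently-used eviction and a per-entry expiration period, which are only two of the options the description permits. The names `FrontEndCache`, `serve`, and `fetch_from_data_server` are hypothetical and do not appear in the specification.

```python
import time
from collections import OrderedDict


class FrontEndCache:
    """Hypothetical passive cache: LRU eviction plus per-entry expiration."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock                # injectable clock, useful for testing
        self._entries = OrderedDict()     # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None                   # cache miss
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]        # invalidate stale content
            return None
        self._entries.move_to_end(key)    # mark as most recently used
        return value

    def put(self, key, value, ttl):
        self._entries[key] = (value, self.clock() + ttl)
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)   # evict least recently used


def serve(cache, key, fetch_from_data_server, ttl=60.0):
    """Return (data, was_cache_hit); go to the data server only on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached, True
    value = fetch_from_data_server(key)
    cache.put(key, value, ttl)            # returned data may be stored in the cache
    return value, False
```

In this sketch a stored value of `None` is indistinguishable from a miss; an implementation that must cache empty results would need a separate sentinel.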

[0052] The formulated query is passed to storage server interface 405, which handles communication with storage server 210 and storage management server 212. Channel 202 is compatible with an interface to data server 210, which may include a TCP/IP interface as well as Ethernet, Fibre Channel, or other available public or proprietary physical and transport layer interfaces.

[0053] Storage server 210 and storage management server 212 return responses to interface 405, which are then supplied to data filter 406 and/or HTTP reassemble component 407. Data filter component 406 may filter and/or constrain database contents returned in the response. Data filter component 406 is optionally used to implement data decompression and decryption where appropriate, and to handle caching when the returning data is of a cacheable type. HTTP reassemble component 407 formats the response into a format suitable for use by client 205, which in the particular examples herein comprises a web page transported via HTTP.

[0054] Where a front-end 201 and back-end 203 together are used to implement an intermediary server 206, front-end 201 is responsible for translating transmission control protocol (TCP) packets from client 205 into transport morphing protocol (TMP) packets used in the system in accordance with the present invention. Transport morphing protocol and TMP are trademarks or registered trademarks of Circadence Corporation in the United States and other countries. TMP packets comprise multiple blended requests generated by data blender 404. Blender 404 slices and/or coalesces the data portions of the received packets into more desirable “TMP units” that are sized for transport through the TMP mechanism 202. The data portion of TCP packets may range in size depending on client 205 and any intervening links coupling client 205 to TCP component 401. Moreover, where compression is applied, the compressed data will vary in size depending on the compressibility of the data. Data blender 404 receives information from front-end manager 207 that enables selection of a preferable TMP packet size. Alternatively, a fixed TMP packet size can be set that yields desirable performance across TMP mechanism 202. Data blender 404 also marks the TMP units so that they can be re-assembled at the receiving end.
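
By way of illustration only, the slicing, coalescing, and marking performed by data blender 404 might be sketched as below. The specification does not define the TMP unit format, so the representation of a unit as a `(sequence_number, bytes)` pair, and the function names `blend` and `reassemble`, are assumptions made solely for this example.

```python
def blend(payloads, unit_size):
    """Coalesce variable-size payloads, then slice them into fixed-size units.

    Each unit is marked with a sequence number (an assumed marking scheme)
    so that the receiving end can restore the original byte order.
    """
    stream = b"".join(payloads)                      # coalesce small payloads
    return [
        (seq, stream[offset:offset + unit_size])     # slice into sized units
        for seq, offset in enumerate(range(0, len(stream), unit_size))
    ]


def reassemble(units):
    """Restore the original byte stream from units received in any order."""
    return b"".join(data for _, data in sorted(units))


units = blend([b"ab", b"cdefg", b"h"], unit_size=3)
# units == [(0, b"abc"), (1, b"def"), (2, b"gh")]
original = reassemble(reversed(units))
# original == b"abcdefgh"
```

The final unit may be shorter than `unit_size`; whether a real implementation pads, holds back, or sends such a unit is not stated in the description.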

[0055] Data blender 404 also serves as a buffer for storing packets from all clients 205 that are associated with front-end 201. Blender 404 mixes data packets coming into front-end 201 into a cohesive stream of TMP packets sent to back-end 203 over TMP link 202. In creating a TMP packet, blender 404 is able to pick and choose among the available client packets so as to prioritize some client packets over others. Prioritization is effected by selectively transmitting request and response data from multiple sources in an order determined by a priority value associated with the particular request and response. For purposes of the present invention, any algorithm or criteria may be used to assign a priority.
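
By way of illustration only, the buffering and prioritized selection described above can be sketched with a priority queue. The class name `PriorityBlender`, the convention that a larger number means higher priority, and the first-in/first-out tie-breaking are assumptions; the description leaves the priority algorithm open.

```python
import heapq
import itertools


class PriorityBlender:
    """Hypothetical buffer that releases client packets in priority order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()   # tie-breaker: earlier arrivals first

    def enqueue(self, client_id, data, priority):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), client_id, data))

    def next_tmp_packet(self, max_items):
        """Pick up to max_items buffered client packets for one outgoing packet."""
        batch = []
        while self._heap and len(batch) < max_items:
            _, _, client_id, data = heapq.heappop(self._heap)
            batch.append((client_id, data))
        return batch


blender = PriorityBlender()
blender.enqueue("client-1", b"low", priority=1)
blender.enqueue("client-2", b"high", priority=5)
blender.enqueue("client-3", b"mid", priority=3)
first = blender.next_tmp_packet(max_items=2)
# first == [("client-2", b"high"), ("client-3", b"mid")]
```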

[0056] Also, where a front-end 201 and back-end 203 together are used to implement an intermediary server 206, storage server interface 405 can implement transport protocol algorithms that create a more efficient connection between a front-end 201 and a back-end 203. Where a single intermediary server 206 is used, however, interface 405 should implement protocols that enable communication with storage servers 210 and storage management servers 212.

[0057] Optionally, front-end 201, back-end 203, and/or intermediary computer 206 implement security processes, compression processes, encryption processes and the like to condition the received data for improved transport performance and/or provide additional functionality. These processes may be implemented within any of the functional components (e.g., data blender 404) or implemented as separate functional components within front-end 201. Also, parser 402 may identify priority information transmitted with a request. The prioritization value may be provided by the owners of data server 210, for example, and may be dynamically altered, statically set, or updated from time to time to meet the needs of a particular application. Moreover, priority values may be computed to indicate aggregate priority over time, and/or combine priority values from different sources to compute an effective priority for each database request.

[0058] TMP is a TCP-like protocol adapted to improve performance for multiple channels operating over a single connection. The TMP mechanism in accordance with the present invention creates and maintains a stable connection between two processes for high-speed, reliable, adaptable communication. Another feature of TMP is its ability to channel numerous TCP connections through a single TMP connection 202. The environment in which TMP resides allows multiple TCP connections to occur at one end of the system. These TCP connections are then combined into a single TMP connection. The TMP connection is then broken down at the other end of the TMP pipe 202 in order to traffic the TCP connections to their appropriate destinations. TMP includes mechanisms to ensure that each TMP connection gets enough of the available bandwidth to accommodate the multiple TCP connections that it is carrying.
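
By way of illustration only, the combining of several TCP connections into one TMP connection, and the breaking down of that connection at the far end, can be sketched as channel-tagged framing. The TMP wire format is not given in the specification; tagging each chunk with a channel identifier and interleaving channels round-robin are assumptions of this example, as are the names `multiplex` and `demultiplex`.

```python
from collections import defaultdict
from itertools import zip_longest


def multiplex(connections):
    """Interleave chunks from several connections into one frame sequence.

    connections maps a channel id to the ordered list of byte chunks
    received on that connection. Each frame is (channel_id, chunk).
    """
    tagged = [
        [(channel_id, chunk) for chunk in chunks]
        for channel_id, chunks in connections.items()
    ]
    frames = []
    for round_of_frames in zip_longest(*tagged):     # one chunk per channel per round
        frames.extend(f for f in round_of_frames if f is not None)
    return frames


def demultiplex(frames):
    """Split the single frame sequence back into per-connection byte streams."""
    streams = defaultdict(bytes)
    for channel_id, chunk in frames:
        streams[channel_id] += chunk
    return dict(streams)


frames = multiplex({1: [b"GET ", b"/a"], 2: [b"GET /b"]})
# frames == [(1, b"GET "), (2, b"GET /b"), (1, b"/a")]
streams = demultiplex(frames)
# streams == {1: b"GET /a", 2: b"GET /b"}
```

The sketch does not model the bandwidth allocation mechanisms mentioned above; it shows only the channel bookkeeping.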

[0059] FIG. 5 illustrates data exchanges between entities in an exemplary data access transaction in accordance with the present invention. From the perspective of client 205, there is a simple request/response exchange that is conducted with a front-end 201 (or intermediary server 206). It is a valuable feature that client 205 need only be configured to communicate conventional request/response exchanges, as this usually enables a client 205 to use the present invention without modification from existing network interface mechanisms. In other words, the client does not need to implement specialized hardware or software.

[0060] In a web-based environment, client 205 displays a web page having a number of hypertext links. The web page is generated by a web server or by front-end 201, or is stored internally to the client. In a particular example, the links include references to desired data objects. Because these links do not refer directly to a server/directory/file name at which the data object is located, they are referred to herein as “tokens”. Client 205 need not be aware of the actual location at which a data object is stored. The term “data object” as used herein refers broadly to any set of data stored at one or more specific locations within network-connected or direct connected storage mechanisms. Data objects include single files, portions of files, sequences of files, and the like.

[0061] Front-end 201 receives client data requests and implements processes to resolve the data request into a response. In cases of write operations, the response may be a confirmation or acknowledgment that the data was written to storage. In the case of a read request, the response will include requested data. However, front-end 201 lacks a priori knowledge of where the requested data resides, even if the requested data resides on direct connected storage such as data store 211 or on a virtual database 211.

[0062] Front-end 201 sends or forwards the token associated with the data request to storage management server 212. The token comprises a data structure that identifies the data that is subject of the request, and optionally identifies the requester (i.e., the client 205) and other data required by storage management server 212. In other embodiments, the token identifies one or more intended recipients of the data, which may include the requesting client 205. In response, storage management server 212 sends file location information to front-end 201. Front-end 201 can use the file location information to locate and access the physical storage device upon which the data is stored. For example, front-end 201 generates file requests and receives file responses from a particular data store 211. Front-end 201 then generates and sends a response to the requesting client 205.
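
By way of illustration only, the exchange of FIG. 5 can be sketched as follows. The specification does not fix the layout of the token or of the file location information, so the fields of `Token`, the `(store_name, path)` location pair, and the class and method names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    """Hypothetical token: names the data, optionally names the requester."""
    object_id: str
    requester: Optional[str] = None


class StorageManagementServer:
    """Holds the mapping from tokens to specific storage locations."""

    def __init__(self, locations):
        self._locations = locations          # object_id -> (store_name, path)

    def resolve(self, token):
        return self._locations[token.object_id]


class FrontEnd:
    """Has no a priori knowledge of where data resides; asks the manager."""

    def __init__(self, manager, stores):
        self.manager = manager
        self.stores = stores                 # store_name -> {path: data}

    def handle_read(self, token):
        store_name, path = self.manager.resolve(token)   # file location information
        return self.stores[store_name][path]             # file request / file response


manager = StorageManagementServer({"report": ("store-211", "/vol1/report.dat")})
front_end = FrontEnd(manager, {"store-211": {"/vol1/report.dat": b"contents"}})
response = front_end.handle_read(Token("report", requester="client-205"))
# response == b"contents"
```

The client supplies only the token, so the storage location can change without any change to the links presented to the client.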

[0063] FIG. 6 illustrates data exchanges between entities in an exemplary data access transaction in accordance with an alternative implementation of the present invention. In the implementation of FIG. 6, the front-end 201 to which the original client data request is directed does not have access to the requested data. However, front-end 201 is not aware of this until it accesses storage management server 212 as described above. In the implementation of FIG. 6, front-end 201 uses the file location information to generate a redirect response to client 205 that points the client 205 to an alternative front-end 201 that can access the requested data. Protocols such as HTTP include redirect mechanisms that make the operation shown in FIG. 6 practical to implement without changes to the software on client 205.

[0064] In response to the redirect from the first front-end 201, client 205 generates a redirected request to the alternative front-end 201. The redirected request may include additional file location information that would allow the alternative front-end 201 to access its available data store(s) 211 directly. Alternatively, the alternative front-end 201 may refer to storage management server 212 to obtain file location information in a manner similar to that described for FIG. 5. In either case, the alternative front-end 201 supplies the response to the client's data request.
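
By way of illustration only, the redirect behavior of FIG. 6 can be sketched using an HTTP-style response. The use of status 302 with a Location header is one standard HTTP redirect mechanism; the function name `handle_request`, the `front_end_for_store` lookup table, the host name, and the URL layout are hypothetical.

```python
def handle_request(local_stores, resolve, token, front_end_for_store):
    """Serve the data locally if possible, otherwise redirect the client.

    local_stores:        store_name -> {path: data} reachable from this front-end
    resolve:             callable standing in for the storage management server,
                         token -> (store_name, path)
    front_end_for_store: store_name -> host name of a front-end that can reach it
    Returns (status, headers, body).
    """
    store_name, path = resolve(token)
    if store_name in local_stores:
        return 200, {}, local_stores[store_name][path]
    # This front-end only learns here that it cannot reach the data.
    alternative = front_end_for_store[store_name]
    return 302, {"Location": f"http://{alternative}/{token}"}, b""


locations = {"t1": ("store-a", "/x"), "t2": ("store-b", "/y")}
local = {"store-a": {"/x": b"X"}}
owners = {"store-b": "fe2.example"}

served = handle_request(local, locations.__getitem__, "t1", owners)
# served == (200, {}, b"X")
redirected = handle_request(local, locations.__getitem__, "t2", owners)
# redirected == (302, {"Location": "http://fe2.example/t2"}, b"")
```

In this sketch the redirect carries only the token, so the alternative front-end would resolve it again; the description also allows the redirected request to carry file location information directly.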

[0065] The present invention supports a variety of implementations that meet specific needs of specific applications. In the primary embodiments described above, data is served to the client in response to a client request. Alternatively, data may be served to a computer other than the client, such as participants in an online meeting, broadcast, or multicast session, either in response to client requests or according to a programmed routine executing in a front-end 201. Hence, the present invention readily supports transfer of data objects from a network-connected data storage mechanism to any specified network-connected computer rather than simply returning data to the computer that requested the data object. This can be useful in presentations and multimedia distribution using broadcast and multicast. For example, a first client 205 may issue a token that represents a particular data object to a front-end 201. The token may be accompanied by an identification of one or more recipients for the data object. Alternatively, the front-end 201 may maintain the identification of recipients.

[0066] As yet another alternative, the client request including a particular token may serve as a trigger for further data transfers between and among front-ends 201, data stores 211, and clients 205. Certain types of data objects are either explicitly or implicitly related to other data objects. A presentation or online event, for example, often has an explicit flow such that once a particular event is reached, the system can know with a high level of certainty what possible data object or objects will be requested after the current event. By way of example of an implicit data ordering, an initial client request may include a token identifying a particular multimedia file such as a song or music video. It can be anticipated that other songs on the same album or songs of a similar genre or artist are likely to be subject of subsequent requests.

[0067] In accordance with an embodiment of the present invention, the initial token sent by a client can be resolved to a pointer not only to the particular data object that is subject of the current request, but can include a secondary token indicating other data objects with a probability of being requested in the future. This feature enables front-end 201 to proactively redistribute data into its cache 403, for example, so that if and when the subsequent request is received, it can be served more quickly. Alternatively, the front-end 201 and/or storage management server 212 may determine which data objects should be proactively distributed to front-end 201, in which case client 205 need not send secondary tokens.
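
By way of illustration only, proactive redistribution driven by secondary tokens can be sketched as below. The `related` table that supplies secondary tokens, the plain dictionary used as the cache, and the function names are assumptions; how related objects are actually identified is left open by the description.

```python
def resolve_with_hints(token, locations, related):
    """Return the location for token plus any secondary tokens for it."""
    return locations[token], related.get(token, [])


def serve_and_prefetch(token, locations, related, read, cache):
    """Serve the requested object and pull likely follow-ups into the cache.

    read is a callable standing in for access to a data store:
    location -> data.
    """
    location, secondary_tokens = resolve_with_hints(token, locations, related)
    data = cache.get(token)
    if data is None:
        data = read(location)
        cache[token] = data
    for hint in secondary_tokens:
        if hint not in cache:                 # redistribute ahead of the request
            cache[hint] = read(locations[hint])
    return data


locations = {"song1": "/album/1", "song2": "/album/2"}
related = {"song1": ["song2"]}                # same album, likely requested next
reads = []

def read(location):
    reads.append(location)                    # record each access to the data store
    return location.encode()

cache = {}
serve_and_prefetch("song1", locations, related, read, cache)
serve_and_prefetch("song2", locations, related, read, cache)
# reads == ["/album/1", "/album/2"]; the second request is served from the cache
```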

[0068] Although the invention has been described and illustrated with a certain degree of particularity, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the combination and arrangement of parts can be resorted to by those skilled in the art without departing from the spirit and scope of the invention, as hereinafter claimed. For example, while devices supporting HTTP data traffic are used in the examples, the HTTP devices may be replaced or augmented to support other public and proprietary protocols including FTP, NNTP, SMTP, SQL and the like. In such implementations the front-end 201 and/or back-end 203 are modified to implement the desired protocol. Moreover, front-end 201 and back-end 203 may support different protocols such that the front-end 201 supports, for example, HTTP traffic with a client and the back-end supports a DBMS protocol such as SQL. Such implementations not only provide the advantages of the present invention, but also enable a client to access a rich set of network resources with minimal client software.

We claim:
 1. A data storage system comprising: a communication network; a client application coupled to the network and generating an access request for stored data, wherein the client application lacks a priori knowledge of the location of the requested data; an intermediary server coupled to the network to receive the request; one or more data storage devices accessible through the intermediary server and having a plurality of data units stored at selected locations therein; a storage server having knowledge of the location of data units in the storage devices and having an interface for communicating with the intermediary servers; processes within the intermediary server responsive to a received data access request for communicating with the storage server to obtain knowledge about the location of requested data from the data in response to a received client request; and processes within the intermediary server for obtaining the data from the specific location and serving the data to the requesting client application.
 2. The system of claim 1 wherein the data is returned such that the client remains unaware of the specific location of the data.
 3. The system of claim 1 wherein the intermediary server has a lower latency connection to the client application than does the storage server.
 4. The system of claim 1 wherein at least some of the storage devices comprise direct attached storage for the intermediary server.
 5. The system of claim 1 wherein at least some of the storage devices comprise network attached storage.
 6. The system of claim 1 wherein at least some of the storage devices are configured as a storage area network.
 7. The system of claim 1 wherein the access request is represented by a token.
 8. The system of claim 1 wherein the processes for communicating with the storage server further comprises transmission of a token representing the requested data.
 9. The system of claim 1 wherein the processes for communicating with the storage server further comprises processes for receiving a resource locator from the storage server.
 10. The system of claim 1 wherein the processes for communicating with the storage server further comprise processes for receiving a file name and file path from the intermediary server.
 11. A method for managing on-network data storage comprising the acts of: providing a communication network; receiving requests for data within an intermediary server from a plurality of external client applications coupled to the network; storing units of data in one or more data storage devices accessible to the intermediary server; associating each storage request with a token representing the request; sending the token to a storage server coupled to the network and having an interface for communicating with the intermediary server; causing the storage server to return specific location information corresponding to the request associated with the received token; causing the intermediary server to access the data storage mechanism using the specific location information to retrieve data at the specific location; and delivering the retrieved data to the client application that generated the request.
 12. A method for transferring data between network-connected computers comprising the acts of: storing a data object at a specific location in a network-connected storage mechanism; transmitting a token representing the data object from a first network-connected computer to an intermediary computer; in the intermediary computer, using the token to identify the specific storage location of the data object; causing the storage mechanism to transfer the data object to a second network-connected computer.
 13. The method of claim 12 wherein the step of sending the token further comprises sending an identification of the second network-connected computer.
 14. The method of claim 12 wherein the act of transferring the data object comprises transferring the data object to a plurality of network-connected computers.
 15. The method of claim 12 further comprising: storing copies of the data object at multiple network-connected storage mechanisms; using the intermediary computer to select one of the multiple network-connected storage mechanisms; and causing the selected network-connected storage mechanism to transfer the data object to a second network-connected computer.
 16. The method of claim 12 wherein the step of causing the storage mechanism to transfer the data object to a second network-connected computer comprises: transferring the data object to a front-end server topologically close to the second network-connected computer; and transferring the data object from the front-end server to the second network-connected computer.
 17. The method of claim 12 wherein the data object at the specific location is referred to as a primary data object, the method further comprising: causing the network-connected storage mechanism to proactively redistribute data objects by transferring, in addition to the primary data object, one or more data objects that are sequentially related to the primary data object.
 18. A data distribution service comprising: one or more data storage mechanisms holding a plurality of data objects at specific non-public locations; an interface for receiving tokens, the tokens associated with particular ones of the data objects and the tokens lacking specific location information indicating the locations of the data objects in the one or more data storage mechanisms; and in exchange for payment, supplying the specific non-public locations of the data objects associated with the received tokens.
 19. A method for version control of a data object comprising: receiving a token representing a first version of a data object; using the token to identify a second version of the data object; and identifying a specific storage location of the second version of the data object in response to the received token.